Dictionary-based Word Segmentation for Javanese
نویسندگان
چکیده
منابع مشابه
Unsupervised Word Segmentation Without Dictionary
This prototype system demonstrates a novel method of word segmentation based on corpus statistics. Since the central technique we used is unsupervised training based on a large corpus, we refer to this approach as unsupervised word segmentation. The unsupervised approach is general in scope and can be applied to both Mandarin Chinese and Taiwanese. In this prototype, we illustrate its use in wo...
متن کاملNon-Dictionary-Based Thai Word Segmentation Using Decision Trees
For languages without word boundary delimiters, dictionaries are needed for segmenting running texts. This figure makes segmentation accuracy depend significantly on the quality of the dictionary used for analysis. If the dictionary is not sufficiently good, it will lead to a great number of unknown or unrecognized words. These unrecognized words certainly reduce segmentation accuracy. To solve...
متن کاملDictionary Based Image Segmentation
We propose a method for weakly supervised segmentation of natural images, which may contain both textured or non-textured regions. Our texture representation is based on a dictionary of image patches. To divide an image into separated regions with similar texture we use an implicit level sets representation of the curve, which makes our method topologically adaptive. In addition, we suggest a m...
متن کاملDictionary Based Segmentation in Volumes
We present a method for supervised volumetric segmentation based on a dictionary of small cubes composed of pairs of intensity and label cubes. Intensity cubes are small image volumes where each voxel contains an image intensity. Label cubes are volumes with voxelwise probabilities for a given label. An unknown volume is segmented by cube-wise finding the most similar dictionary intensity cube....
متن کاملVoting between Dictionary-Based and Subword Tagging Models for Chinese Word Segmentation
This paper describes a Chinese word segmentation system that is based on majority voting among three models: a forward maximum matching model, a conditional random field (CRF) model using maximum subword-based tagging, and a CRF model using minimum subwordbased tagging. In addition, it contains a post-processing component to deal with inconsistencies. Testing on the closed track of CityU, MSRA ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Procedia Computer Science
سال: 2016
ISSN: 1877-0509
DOI: 10.1016/j.procs.2016.04.051